[57] ABSTRACT 

A text normalizer normalizes text that is output from a 
speech recognizer. The normalization of the text produces 
text that is less awkward and more familiar to recipients of 
the text. The text may be normalized to include audio 
content, video content, or combinations of audio and video 
contents. The text may also be normalized to produce a 
hypertext document. The text normalization is performed 
using a context-free grammar. The context-free grammar 
includes rules that specify how text is to be normalized. The 
context-free grammar may be organized as a tree that is used 
to parse text and facilitate normalization. The context-free 
grammar is extensible and may be readily changed. 

50 Claims, 16 Drawing Sheets 
TEXT NORMALIZATION USING A 
CONTEXT-FREE GRAMMAR 

TECHNICAL FIELD 

The present invention relates generally to data processing 
systems, and more particularly, to text normalization using 
a context-free grammar. 

BACKGROUND OF THE INVENTION 

Speech recognizers have gained popularity in recent 
years. A speech recognizer typically includes software that 
is run on a computer system to recognize spoken words or 
phrases. The speech recognizer generally outputs text cor- 
responding to its interpretation of the spoken input. For 
example, if a speaker speaks the word "dog," the speech 
recognizer recognizes the spoken word and outputs the text 
"dog." 

Unfortunately, speech recognizers often produce textual 
output that is awkward or not familiar to recipients. For 
example, if a speaker speaks the phrase "one hundred forty 
seven," the speech recognizer outputs "one hundred forty 
seven" rather than the sequence of digits "147." Similar 
awkward textual outputs are produced by speech recognizers 
for inputs that specify dates, times, monetary amounts, 
telephone numbers, addresses, and acronyms. As a result, 
the recipient of the textual output is forced to manually edit 
the text to put it in a more acceptable form. As speech 
recognizers are being incorporated in document creation 
software, the inability of the speech recognizers to produce 
acceptable textual output substantially diminishes the use- 
fulness of such software. 

SUMMARY OF THE INVENTION 

The present invention overcomes the limitation of prior 
art speech recognizers by providing a facility for normaliz- 
ing text. The normalization of text produces output text that 
is more acceptable to recipients. The normalization may also 
include the substitution of textual content with non-textual 
content, such as audio content, video content, or even a 
hypertext document. 

In accordance with a first aspect of the present invention, 
a method is practiced in a computer system that has a speech 
recognition engine for recognizing content in an input 
speech. Text corresponding to speech input is received from 
the speech recognition engine by the computer system. A 
context-free grammar is applied to identify substitute content 
for the received text. The received text is substituted with 
the substitute content. 

In accordance with another aspect of the present 
invention, a file is provided in a computer system to set forth 
rules of a context-free grammar for normalizing text. Text is 
received from a speech recognizer that recognizes portions 
of speech in speech input. The text corresponds to speech 
input. At least a portion of the text is normalized to replace 
the portion with a normalized alphanumeric string 
("alphanumeric" as used in this context is intended to 
include ASCII and Unicode). The normalizing comprises 
applying a rule from the context-free grammar to replace the 
portion of the text being normalized with the normalized 
alphanumeric string. 

In accordance with an additional aspect of the present 
invention, an application program interface (API) that 
includes a text normalizer is provided within a computer 
system. The computer runs an application program and 
includes a speech recognizer for recognizing portions of 
speech in speech input and for outputting text that corresponds 
to the recognized portions of speech. Text is received 
from the speech recognizer at the text normalizer. The text 
is normalized by the text normalizer by applying a rule from 
the context-free grammar to alter contents of the text and 
produce normalized text. The normalized text is passed to 
the application program. 

In accordance with a further aspect of the present 
invention, a computer system includes a speech recognizer 
for recognizing portions of speech in speech input and for 
producing textual output corresponding to the recognized 
portions of speech. The computer system also includes a 
context-free grammar that contains rules for normalizing 
text and a text normalizer that applies at least one rule from 
the context-free grammar to normalize textual output from 
the speech recognizer. 

BRIEF DESCRIPTION OF THE DRAWINGS 

A preferred embodiment of the present invention will be 
described below relative to the following figures. 

FIG. 1 is a block diagram illustrating a computer system 
that is suitable for practicing the preferred embodiment of 
the present invention. 

FIG. 2 is a block diagram illustrating a distributed system 
that is suitable for practicing the preferred embodiment of 
the present invention. 

FIGS. 3A-3E illustrate the data flow between the speech 
recognizer, the text normalizer, and the application programs 
for different types of normalization. 

FIG. 4 illustrates the logical format of the text file that 
holds the context-free grammar. 

FIG. 5 depicts the categories of other rules that are set 
forth within the text file of FIG. 4. 

FIG. 6 is a flow chart illustrating the steps that are 
performed to use the text file for normalizing text. 

FIG. 7 depicts an example portion of the tree for the 
context-free grammar. 

FIG. 8 is a flow chart illustrating the steps that are 
performed to determine when to apply a rule from the 
context-free grammar. 

FIG. 9 depicts an example of normalization of a portion 
of text. 

FIG. 10 is a flow chart illustrating the steps that are 
performed for an application program to receive normalized 
text. 

FIG. 11 is a flow chart illustrating the steps that are 
performed to replace one context-free grammar with 
another. 

FIG. 12 is a flow chart illustrating the steps that are 
performed to edit a context-free grammar. 

DETAILED DESCRIPTION OF THE 
INVENTION 

The preferred embodiment of the present invention pro- 
vides a mechanism for normalizing text that is received from 
a speech recognizer. A context-free grammar is applied to 
perform the text normalization. The context-free grammar 
includes a number of rules that specify how the text is to be 
normalized. These rules are applied to textual output 
received from the speech recognizer to produce normalized 
text. In the preferred embodiment of the present invention, 
the text normalization is performed within an application 
program interface (API) that may be called by application 
programs to receive text corresponding to speech input. 

The preferred embodiment of the present invention may 
provide multiple types of text normalization. For example, 
text may be normalized to produce normalized text. 
Similarly, text may be normalized to produce different types 
of media content. Text may be normalized to produce audio 
content and video content. Text may even be normalized to 
produce hypertext documents that are substituted for the 
text. 

The context-free grammar utilized in the preferred 
embodiment of the present invention is extensible. The 
context-free grammar, as will be described in more detail 
below, is specified within a text file. This text file may be 
replaced with a substitute text file that specifies a different 
context-free grammar. Moreover, the text file may be edited 
so as to alter the contents of the context-free grammar. As the 
context-free grammar is specified within a text file, the 
context-free grammar is human-readable. 

FIG. 1 depicts a computer system 10 that is suitable for 
practicing the preferred embodiment of the present inven- 
tion. The computer system 10 includes a central processing 
unit (CPU) 12 that oversees operations of the computer 
system. The CPU 12 may be realized by any of a number of 
different types of microprocessors. The computer system 
may also include a number of peripheral devices, including 
a keyboard 14, a mouse 16, a microphone 18, a video display 
20, and a loud speaker 22. The microphone 18 may be used 
to receive speech input from a speaker, and a loud speaker 
22 may be used to output audio content, such as speech. The 
computer system 10 may also include a network adapter 24 
for interfacing the computer system with a network, such as 
a local area network (LAN) or wide area network (WAN). 
Those skilled in the art will appreciate that a number of 
different types of network adapters may be utilized in 
practicing the present invention. The computer system 10 
may also include a modem for enabling the computer system 
to communicate with remote computing resources over an 
analog telephone line. 

The computer system 10 additionally includes a primary 
memory 28 and a secondary memory 30. The primary 
memory may be realized as random access memory (RAM) 
or other types of internal memory storage known to those 
skilled in the art. The secondary storage 30 may take the 
form of a hard disk drive, CD-ROM drive, or other type of 
secondary storage device. In general, the secondary memory 
30 may be realized as a secondary storage device that stores 
computer-readable removable storage media, such as 
CD-ROMs. 

The primary memory 28 may hold software or other code 
that constitute a speech recognizer 32. The speech recog- 
nizer may take the form of a speech recognition engine and 
may include ancillary facilities such as a dictionary and 
the like. A suitable speech recognition engine is described in 
co-pending application, entitled "Method And System For 
Speech Recognition Using Continuous Density Hidden 
Markov Models," application Ser. No. 08/655,273, which 
was filed on May 1, 1996 and which is explicitly incorpo- 
rated by reference herein. Those skilled in the art will 
appreciate that portions of the speech recognizer 32 may 
also be stored on the secondary memory 30. The primary 
memory 28 holds a speech application program interface 
(API) 34 that works with the speech recognizer 32 to 
produce textual output corresponding to recognized speech 
within speech input. Application programs 36 may call the 
speech API 34 to receive the textual output that corresponds 
to the recognized portions of the speech input. These appli- 
cation programs 36 may include dictation applications, word 
processing programs, spreadsheet programs and the like. 

The speech API 34 may include a text normalizer 38 for 
performing text normalization. The text normalizer 38 is the 
resource that is responsible for normalizing the text that is 
received by the speech API 34 from the speech recognizer 
32. The types of normalization that are performed by the text 
normalizer 38 will be described in more detail below. 

Those skilled in the art will appreciate that the text 
normalizer 38 need not be part of the speech API 34 but 
rather may exist as a separate entity or may be incorporated 
into the speech recognizer 32. The speech recognizer uses a 
context-free grammar 40 that is shown in FIG. 1 as being 
stored in secondary storage 30. Those skilled in the art will 
appreciate that the context-free grammar 40 may also be 
stored in primary memory 28. 

It should be appreciated that the computer system configuration 
depicted in FIG. 1 is intended to be merely 
illustrative and not limiting of the present invention. The 
present invention may be practiced with other computer 
system configurations. These other configurations may 
include fewer components than those depicted in FIG. 1 or 
may include additional components that differ from those 
depicted in FIG. 1. Moreover, the present invention need not 
be practiced on a single processor computer but rather may 
be practiced in multiprocessor environments, including 
multiprocessors and distributed systems. 

FIG. 2 depicts an instance where the computer system 10 
is a client computer that has access to a network 44. This 
network 44 may be a LAN or a WAN. The network 44 may 
be the Internet, an Intranet or an Extranet. The client 
computer 10 includes networking support 42. This network- 
ing support 42 may include client code for a network 
operating system, a conventional operating system or even 
a web browser. The networking support 42 enables the client 
computer 10 to communicate with the server 46 within the 
network 44. The server 46 may hold media content 48, such 
as audio data, video data, textual data, or a hypertext 
document that is to be used by the client computer 10 in 
normalizing text. 

As was mentioned above, the text normalizer 38 normalizes 
the text received from the speech recognizer 32 to 
produce normalized content. FIG. 3A depicts the flow of 
data between the speech recognizer 32, the text normalizer 
38, and an application program 36. In general, the speech 
recognizer 32 outputs text 50 that corresponds to recognized 
portions of speech within speech input received via the 
microphone 18 or stored in secondary storage 30. The text 
50 may be output a word at a time to the text normalizer 38. 
Nevertheless, those skilled in the art will appreciate that the 
granularity of textual output produced by the speech recognizer 
32 may vary and may include letters or even phrases. 
The text normalizer 38 produces normalized content 52 that 
it passes on to an application program 36. 

FIG. 3B shows an instance where the text normalizer 38 
produces normalized text 54 that it passes to the application 
program 36. The normalized text 54 includes substitute text 
that replaces the text 50 that was output by the recognizer 32. 
However, as shown in FIG. 3C, the text normalizer 38 may, 
alternatively, normalize the text to produce image data 56, 
such as a bitmap, metafile, or other representation of an 
image, that it passes to the application program 36. The text 50 may 
specify an identifier of the representation of the image. In 
this instance, the text normalizer 38 replaces the identifier 
with the actual representation of the image that is identified 
by the identifier. 

FIG. 3D shows an instance wherein the text normalizer 38 
receives text 50 from the speech recognizer 32 and produces 
audio content 58 as the normalized content. In this case, the 
text 50 may identify an audio clip or a file that holds audio 
data. This identifier is replaced with the associated audio clip 
or file when normalized. Alternatively, the text may be a 
word or phrase for which the text normalizer 38 has an audio 
representation and wishes to substitute the audio representation 
for the word or phrase. 

FIG. 3E depicts an instance wherein the text normalizer 
38 receives text 50 from the speech recognizer 32 and 
outputs a hypertext document 60 to the application program 
36. The text 50 may include an identifier, such as a uniform 
resource locator (URL), that is associated with the hypertext 
document 60. When the text normalizer 38 receives the text 
50 for normalization, it replaces the text with the associated 
hypertext document 60. 

It should be appreciated that the text normalizer may 
combine different types of media content in the resulting 
normalized content 52 that is passed to the application programs. 
It should also be appreciated that the text normalizer 
38 may draw upon media content or resources within a 
network 44 to realize the normalization. For purposes of 
simplicity and clarity, the discussion below will focus on 
instances like that depicted in FIG. 3B wherein text 50 is 
normalized by the text normalizer 38 to produce normalized 
text 54. 

As was mentioned above, the context-free grammar 40 is 
stored as a text file. The text file holds a specification of the 
rules of the context-free grammar. FIG. 4 depicts a logical 
organization of the text file 62. The text file 62 is divided into 
three major sections 64, 66, and 68. Each of the sections is 
delineated by a header or tag within the text file 62 (e.g., 
"[spacing]," "[capitalization]," "[Rules]"). The first section 
is the spacing section 64 that specifies rules of the context- 
free grammar relative to spacing. These rules are imple- 
mented as a table. An example of a specification of rules 
within the table is as follows: 



left    right    substitution    switch 

"."     " "      "  "            {1} 
"."     " "      " "             {!1} 



The table includes a "left" column that specifies a character 
that appears to the left, a "right" column that specifies a 
character that appears to the right, a "substitution" column 
that holds a proposed substitution for the right character, and 
a "switch" column that specifies whether the rule is in effect 
or not. The first rule in the above example specifies that if a 
period (i.e., the left character) is followed by a space (i.e., 
the right character), two spaces are to be substituted for the 
single space. The switch column holds a value of "1" and 
thus indicates that this rule is in effect. The second rule 
(specified underneath the first rule in the above example) 
indicates that a period is to be followed by only a single space. 
The switch column, however, holds a value of "!1," which 
indicates that the rule is not in effect. 
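
As a rough illustration of how such a switchable table might behave, the following Python sketch models each row as a record and applies the first enabled rule that matches a pair of adjacent characters. The names `SpacingRule` and `apply_spacing` are hypothetical and chosen only for this example; the description above does not prescribe any particular implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpacingRule:
    left: str          # character that appears to the left
    right: str         # character that appears to the right
    substitution: str  # proposed replacement for the right character
    enabled: bool      # the "switch" column: True when the rule is in effect


def apply_spacing(text: str, rules: List[SpacingRule]) -> str:
    """Apply the first enabled rule that matches each adjacent character pair."""
    if not text:
        return text
    out = [text[0]]
    for prev, cur in zip(text, text[1:]):
        for rule in rules:
            if rule.enabled and rule.left == prev and rule.right == cur:
                out.append(rule.substitution)
                break
        else:
            out.append(cur)  # no enabled rule matched: keep the character
    return "".join(out)


# The two example rows: two spaces after a period (on), one space (off).
RULES = [
    SpacingRule(".", " ", "  ", enabled=True),
    SpacingRule(".", " ", " ", enabled=False),
]
```

Flipping the `enabled` flags corresponds to the user choosing a different spacing option through the property sheet.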

It should be noted that a user interface, such as a property 
sheet, may be provided to enable a user to choose which of 
the spacing rules are in effect or not. The user choices are 
used to set the switch fields within the table. 

The capitalization section 66 is also organized as a table 
like that provided for the spacing section 64. This section 66 
holds capitalization rules, such as a rule that the first letter of a word 
following a period that ends a sentence is capitalized. These 
rules may also be implemented as switchable so that a user 
may choose capitalization options. 

The third section is the other rule section 68. The other 
rule section holds a specification of a number of different rules 
that do not concern capitalization or spacing. This section is 
delineated by a "Rules" heading or tag. An example of such 
a rule is as follows: 



<Digits> = [1+] <0..9> 
<0..9> = zero "0" 
<0..9> = one "1" 
    ... 
<0..9> = nine "9" 



This rule indicates that written digits may include one or more 
words containing digits, and the rule specifies the substitution 
of digits for written digit strings (e.g., "1" for "one"). 
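
A minimal Python sketch of the effect of this rule is given below. It assumes that the elided entries cover the words "two" through "eight" and that a run of one or more digit words is replaced by the concatenated digits; the function name `normalize_digits` is hypothetical.

```python
# Word-to-digit table; entries for "two" through "eight" are assumed here,
# since the example rule elides them.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def normalize_digits(words):
    """Replace each maximal run of digit words with the concatenated digits."""
    out, run = [], []
    for word in words:
        if word in DIGIT_WORDS:
            run.append(DIGIT_WORDS[word])
        else:
            if run:
                out.append("".join(run))  # flush the pending run of digits
                run = []
            out.append(word)
    if run:
        out.append("".join(run))
    return out
```

For instance, the word list for "call nine one one now" would come back with the three digit words collapsed into a single token.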

FIG. 5 depicts the categories of other rules that may be 
implemented in accordance with the preferred embodiment 
of the present invention. The glossary category of rule 70 
specifies the replacement of text with the substitute text. A 
user may type in such substitutions as part of the glossary to 
enable shorthand ways of adding text to a document. The 
numbers category 72 contains rules that specify the substitution 
of the written form of words (i.e., a string of words) 
with a digital representation composed solely of digits. For 
example, "one hundred forty seven" is replaced by "147" by 
application of rules in this category 72 of rules. 
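
The arithmetic behind such a substitution can be sketched as follows. This is only an illustration under simplifying assumptions: it handles values below one thousand, does not validate word order, and uses hypothetical names (`UNITS`, `TENS`, `words_to_number`) that do not come from the grammar file itself.

```python
# Number words for 0..19 and for the tens 20..90.
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}


def words_to_number(words):
    """Return the digit string for a list of number words, or None if any
    word is not a number word (so the caller can leave the text unchanged)."""
    if not words:
        return None
    total = 0
    for word in words:
        if word in UNITS:
            total += UNITS[word]
        elif word in TENS:
            total += TENS[word]
        elif word == "hundred":
            total = max(total, 1) * 100  # a bare "hundred" counts as 100
        else:
            return None
    return str(total)
```

With the phrase from the text, the running total goes 1, 100, 140, 147, which yields the digit string "147".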

A dates category 74 contains rules that concern how 
spoken versions of dates are to be normalized. For example, 
the output text "april first nineteen ninety seven" is normalized 
to "Apr. 1, 1997." 

The currencies category 76 holds rules that normalize the 
specification of monetary amounts. For example, the phrase 
"ten cents" may be normalized to "10¢" by rules in this 
category 76. 

The times category 78 holds rules that are used to nor- 
malize specification of time. For instance, the text "four 
o'clock in the afternoon" may be normalized to "4 p.m." by 
rules within this category 78. 

The fractions category 80 normalizes fractions into a 
mathematical form. Hence, the text "one-fourth" may be 
normalized to "¼" by rules in this category 80. 

The acronyms category 82 normalizes text that specifies 
acronyms. For example, the text "CIA" may be normalized 
to "C.I.A." by rules in this category 82. 

The addresses category 84 contains rules for normalizing 
the specification of addresses. For instance, the string "one 
hundred fifty sixth" may be normalized to "156th" by rules 
within this category 84. 

The phone numbers category 86 normalizes the specification 
of phone numbers. When a user speaks a phone 
number, the speech recognizer may interpret the phone 
number as merely a sequence of digits. For example, the 
string "nine three six six three zero zero zero" may be 
normalized to "936-3000" by rules within this category 86. 

The city, state, zip code category 88 holds rules for 
specifying how a sequence of city, state, and zip code should 
appear. For example, the text "Seattle Washington nine eight 
zero five two" may be normalized to "Seattle, Wash. 98052" 
by rules within this category 88. 

The measurement units category 90 applies rules regard- 
ing the specification of measurements. For instance, the text 
"nineteen feet" will be normalized to "19 ft." by rules within 
this category 90. 

Those skilled in the art will appreciate that the text file 62 
may have a format other than that depicted within FIG. 
4. Moreover, the text file 62 may include rules for substituting 
text with audio content or video content. Rules may 
also be included for substituting text with hypertext documents. 
Those skilled in the art will appreciate that the 
context-free grammar need not be specified as a text file in 
practicing the present invention. 

Those skilled in the art will further appreciate that addi- 
tional categories of rules other than those depicted in FIG. 
5 may be utilized. Still further, fewer categories of rules or 
different categories of rules may apply other than those 
depicted in FIG. 5. 

In order to utilize the context-free grammar 40, the text 
file 62 must be read and processed. FIG. 6 is a flowchart that 
depicts the steps that are performed to utilize the context- 
free grammar in normalizing text. First, the text file 62 that 
holds the context-free grammar is read (step 92 in FIG. 6). 
The contents held therein are used to build a tree representation 
for the context-free grammar (step 94 in FIG. 6). This 
tree representation is used in parsing the input text received 
from the speech recognizer 32. Each path of the tree 
specifies a portion of a rule for normalizing text. Thus, the 
text received from the speech recognizer 32 is processed by 
the text normalizer 38 to compare the text with the rules 
contained within the tree and perform the appropriate nor- 
malization. Accordingly, text is received from the speech 
recognizer (step 96 in FIG. 6) and normalized (step 98 in 
FIG. 6). The tree acts largely as a parsing mechanism for 
deciding what portions of text received from the speech 
recognizer 32 should be normalized and how these portions 
should be normalized. 
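
The reading step can be pictured with a short Python sketch that splits the contents of the grammar file into its bracketed sections. The exact file syntax is not fully specified here, so the sketch assumes that each section begins with a header line of the form "[name]" and that every other non-blank line belongs to the most recent header; `parse_grammar_file` is a hypothetical helper.

```python
def parse_grammar_file(text):
    """Split grammar file contents into sections keyed by lowercase header name.

    Assumes headers look like "[spacing]" on a line of their own; all other
    non-blank lines are collected, unparsed, under the most recent header.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections
```

The rule lines collected under the "rules" key would then be handed to whatever code builds the tree representation.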

FIG. 7 shows an example of a portion of the tree that is 
built by reading rules from the text file. The tree may be 
stored in binary form for optimization. This subtree specifies 
portions of the "Digits" rule that was set forth above as an 
example of rules provided within the text file 62. The tree 
includes a start rule node 100 followed by a digits rule node 
102. Nodes 104 and 106 specify that if the received text is 
"zero" the text is to be normalized and replaced with "0." 
Similarly, nodes 108, 110, 112, and 114 indicate the substitutions 
of "1" for "one" and "9" for "nine," respectively. 

An example is helpful in illustrating how the subtree 
depicted in FIG. 7 may be used. Suppose that the text 
normalizer 38 receives the string "zero." The text normalizer 
starts at the start rule 100 and then determines that the string 
"zero" specifies a digit. It then follows the path to node 104 
and determines that there is a match. The text normalizer 
then uses the substitute or normalized string "0" specified in 
node 106 to normalize the received string. 
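
One way to picture such a tree is as a word-keyed trie in which the node reached at the end of a phrase carries the substitute string. The Python sketch below is an assumption about a possible representation, not the binary format mentioned above; `Node`, `build_tree`, and `lookup` are hypothetical names.

```python
class Node:
    """One node of the rule tree."""

    def __init__(self):
        self.children = {}      # maps the next word to a child Node
        self.substitute = None  # normalized string if a rule ends at this node


def build_tree(rules):
    """Build a tree from (phrase, substitute) pairs."""
    root = Node()
    for phrase, substitute in rules:
        node = root
        for word in phrase.split():
            node = node.children.setdefault(word, Node())
        node.substitute = substitute
    return root


def lookup(root, words):
    """Follow the path for `words`; return the substitute string or None."""
    node = root
    for word in words:
        node = node.children.get(word)
        if node is None:
            return None
    return node.substitute
```

Looking up the single word "zero" in a tree built from the digit rules follows one edge from the root and returns the substitute "0", mirroring the walk from the start rule to the matching node described above.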

The rules are not necessarily applied on a word-by-word 
basis. Instead, the system seeks to apply the rule that will 
normalize the greatest length string within the text received 
from the speech recognizer 32. FIG. 8 is a flowchart illus- 
trating the steps that are performed in applying the rules. In 
general, a rule will be applied when at least a complete rule 
has been identified and no further portion of a rule can be 
applied. Thus, in step 116 of FIG. 8, the text normalizer 
determines whether it is done normalizing a given portion of 
a text. If the text normalizer is done (see step 116 in FIG. 8), 
the text normalizer applies the rule that normalizes the 
greatest length of string in the non-normalized text (step 120 
in FIG. 8). It should be noted that there may be instances 
where multiple rules apply and there has to be a criterion for 
determining which rule to actually utilize. The preferred 
embodiment of the present invention utilizes the rule that 
normalizes the greatest portion of the non-normalized string. 
If, however, it is determined that there is further application 
of the rules to be done (see step 116 in FIG. 8), then the 
additional portions of the rules are applied (step 118 in FIG. 
8). 
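
The longest-match criterion can be sketched in Python as a greedy scan that, at each position, tries the longest candidate phrase first. For brevity the sketch keeps the rules in a dictionary keyed by word tuples rather than in a tree; that choice, the `normalize` function, and the multi-word "twenty one" rule are assumptions made for illustration.

```python
def normalize(words, rules):
    """Greedy longest match: at each position, apply the rule that covers
    the most words; words that match no rule pass through unchanged."""
    longest = max((len(key) for key in rules), default=1)
    out, i = [], 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(longest, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in rules:
                out.append(rules[key])
                i += n
                break
        else:
            out.append(words[i])
            i += 1
    return out


RULES = {
    ("five",): "5",
    ("twenty",): "20",
    ("twenty", "one"): "21",  # hypothetical multi-word rule
    ("cents",): "\u00a2",     # cent sign
}
```

Because longer phrases are tried first, "twenty one" is replaced as a unit whenever both words are present, while a lone "twenty" still falls back to the single-word rule.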

An example is helpful in illustrating when rules are 
applied and how normalization is performed. FIG. 9 depicts 
an example of text string "five chickens at twenty cents 
each." These words are stored within a text buffer 122 that 
is used by the text normalizer 38. The first word, "five," is 
processed by the text normalizer to determine whether there 
are any matching rules or not. There will be a match within 
the digit rule 126 for this word. Before applying the rule, the 
text normalizer 38 looks at the next word, "chickens." As there 
is no rule that applies to the phrase "five chickens," the text 
normalizer 38 knows that it is done (see step 116 in FIG. 8) 
and applies the digit rule to replace "five" with "5." The 
value "5" is stored in a processed buffer 124 that holds the 
normalized text output. 

The system has no rule for "chickens" and thus passes 
the word on to the processed buffer 124 unchanged. Similarly, the 
text normalizer 38 has no rule for the word "at" and thus 
passes the word "at" on to the processed buffer 124. When 
the text normalizer 38, however, encounters "twenty," it has 
a rule that applies (a number rule 128). Before actually using 
the rule, the text normalizer 38 looks at the next word 
"cents" and determines that there is no rule that normalizes 
the phrase "twenty cents." As a result, the number rule 128 
is applied to replace "twenty" with "20." Subsequently, a 
currency rule 130 is applied to replace "cents" with "¢." 
Lastly, the word "each" is not normalized and is passed in 
literal form to the processed buffer 124. 
As was mentioned above, the text normalizer 38 is used 
within the speech API 34. FIG. 10 is a flow chart that depicts 
the steps of how the text normalizer is used in this context. 
Initially, an application program 36 calls the speech API 34 
to receive textual interpretation of input speech (step 132 in 
FIG. 10). A speech recognizer processes the speech input to 
produce a textual output (step 134 in FIG. 10). The text 
normalizer 38 then normalizes the text as has been described 
above (step 136 in FIG. 10). The speech API 34 forwards the 
normalized content to the requesting application program 36 
(step 138 in FIG. 10). 
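
The call sequence can be sketched as a thin wrapper that chains a recognizer and a normalizer. The class name `SpeechAPI`, its `get_text` method, and the use of plain callables for the recognizer and normalizer are assumptions for illustration only and do not reflect any actual speech API.

```python
class SpeechAPI:
    """Hypothetical wrapper: recognize speech, normalize the text, return it."""

    def __init__(self, recognizer, normalizer):
        self._recognize = recognizer  # callable: audio -> recognized text
        self._normalize = normalizer  # callable: text -> normalized content

    def get_text(self, audio):
        """Called by an application program to obtain normalized content."""
        raw_text = self._recognize(audio)  # recognizer produces textual output
        return self._normalize(raw_text)   # normalized content goes to caller


# Stand-in recognizer and normalizer used only to exercise the wrapper.
def fake_recognizer(audio):
    return "five"


def fake_normalizer(text):
    return {"five": "5"}.get(text, text)


api = SpeechAPI(fake_recognizer, fake_normalizer)
```

Keeping the normalizer behind the API means the application program receives normalized content without needing to know that a grammar was applied.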

The preferred embodiment of the present invention has 
the benefit of being flexible and extensible. The context-free 
grammar is extensible in that content may be changed or 
added, or a completely new context-free grammar may be 
specified. FIG. 11 is a flow chart illustrating the steps that are 
performed to replace the context-free grammar with a new 
context-free grammar. The existing context-free grammar 
may be replaced by providing a new text file. The new text file 
holds a specification for the new context-free grammar. The 
computer system 10 then reads the new text file for the 
context-free grammar (step 140 in FIG. 11). The information 
within the text file is utilized to build a new tree for the new 
context-free grammar (step 142 in FIG. 11). The new tree is 
then used to normalize text (step 144 in FIG. 11). 

The entire text file need not be replaced each time the user 
wishes to change the context-free grammar. Instead, the text 
file may be merely edited. FIG. 12 is a flow chart illustrating 
the steps that are performed to alter the context-free grammar 
in this fashion. Initially, the text file that holds the context-free 
grammar is edited (step 146 in FIG. 12). The tree is revised 
accordingly by reading the contents from the edited text file 
and altering the tree in a matching fashion (step 148 in FIG. 12). 
The revised tree may then be utilized to normalize text 
(step 150 in FIG. 12). 
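
One simple way to pick up a replaced or edited grammar file is to rebuild the tree whenever the file's modification time changes. The Python sketch below takes that approach; it rebuilds the whole tree rather than altering it in place, and the polling strategy, the `GrammarStore` class, and the `build` callback are all assumptions rather than the mechanism described above.

```python
import os


class GrammarStore:
    """Hypothetical holder that rebuilds the grammar when its file changes."""

    def __init__(self, path, build):
        self.path = path
        self.build = build   # callable: file contents -> tree (or any grammar)
        self._mtime = None   # modification time seen at the last rebuild
        self.tree = None

    def current(self):
        """Return the grammar, rereading the file if it has been modified."""
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as handle:
                self.tree = self.build(handle.read())
            self._mtime = mtime
        return self.tree
```

A text normalizer could call `current()` before each normalization so that user edits take effect without restarting the application program.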

While the present invention has been described with 
reference to a preferred embodiment thereof, those skilled in 
the art will appreciate that various changes in the form and 
detail may be made without departing from the intended 
scope of the present invention as defined in the appended 
claims. For example, the text normalization may generally 
be applied to substitute textual content with any of a number 
of different types of media. Moreover, the text normalizer 
need not be part of a speech API or part of system provided 
resources. 

We claim: 

1. In a computer system having a speech recognition 
engine for recognizing content in input speech, a method 
comprising the computer-implemented steps of: 

receiving text corresponding to speech input from the 
speech recognition engine; 

applying a context-free grammar to identify substitute 
content for the received text; and 

substituting the text with the substitute content. 

2. The method of claim 1 wherein the substitute content 
comprises an alphanumeric string. 

3. The method of claim 1 wherein the substitute content 
comprises graphical content. 

4. The method of claim 1 wherein the received text is an 
identifier of media content in a distributed system and the 
substitute content is the media content. 

5. The method of claim 4 wherein the received text is a 
uniform resource locator (URL). 

6. The method of claim 5 wherein the substitute content 
[illegible]. 

7. The method of claim 1 wherein the substitute content 
is a hypertext document. 

8. The method of claim 1 wherein the substitute content 
comprises audio content. 

9. The method of claim 1 wherein the context-free grammar 
contains at least one rule for substituting the substitute 
content for the received text. 

10. The method of claim 1 wherein the computer system 
runs an application program and wherein the substitute 
content is forwarded to the application program. 

11. The method of claim 1 wherein the received text is a 
string of words and the substitute content contains a series 
of digits corresponding to at least some of the string of 
words. 

[...] of a different context-free grammar and using the different 
context-free grammar to normalize new text. 

17. The method of claim 15, further comprising the step 
of using the file to build a tree for the context-free grammar 
that is used in the normalizing. 

18. The method of claim 15 wherein the file is a text file. 

19. The method of claim 15 wherein the file includes rules 
regarding capitalization. 

20. The method of claim 15 wherein the file includes rules 
regarding spacing. 

21. The method of claim 15 wherein the file contains 
specification of a switch that identifies whether or not a rule 
[illegible] part of the context-free grammar. 

22. The method of claim 15, further comprising the step 
of altering contents of the file so as to change the context-free 
grammar. 

23. The method of claim 15, further comprising the steps 
of [illegible] text and normalizing the additional 
text by applying another rule from the context-free grammar 
to [illegible] additional [illegible] non-textual 
content. 

24. The method of claim 23 wherein the non-textual 
content includes image data. 

25. The method of claim 23 wherein the non-textual 
content includes audio data. 

26. In a computer system having an application program 
and a speech recognizer for recognizing portions of speech 
in speech input and outputting text corresponding to the 
recognized portions of speech, a method comprising the 
computer-implemented steps of: 

providing an application program interface (API) that 
includes a text normalizer; 

receiving text from the speech recognizer at the text 
normalizer; 

normalizing the text by applying a rule from a context-free 
grammar to alter contents of the text and produce 
normalized text; and 

passing the normalized text to the application program. 

27. The method of claim 26 wherein the API is a speech 
API that provides textual output corresponding to recog- 

12. The method of claim 1 wherein the received text is a 40 nized speech input to the application program. 

string of words specifying an address and the substitute 28. Trie method of claim 26 wherein the application 

content includes a series of digits specifying at least a program requests text from the API to prompt the passing of 

portion of the address. the normalized text to the application program. 

13. The method of claim 1 wherein the received text is a 45 29. A computer system, comprising: 

string of words identifying an amount of currency and the a speech recognizer for recognizing portions of speech in 

substitute content includes digits and a currency symbol that speech input and producing textual output correspond- 

specifies the amount of currency. ing to the recognized portions of speech; 

14. The method of claim 1 wherein the received text is a a context-free grammar that contains rules for normaliz- 
string of words that specifies a fraction and the substitute 5Q ing text; and 

content includes digits and a mathematical operation that in a text normalizer that applies at least one rule from the 

conjunction specify the fraction. context-free grammar to the textual output from the 

15. In a computer system having a speech recognizer for speech recognizer. 

recognizing portions of speech in speech input, a method 30. The computer system of claim 29 wherein the text 

comprising the computer-implemented steps of: $s normalizer is part of an application program interface (API), 

providing a file that sets forth rules of a context-free 31. In a system having a speech recognition engine for 

grammar for normalizing text; recognizing content in input speech, a computer-readable 

receiving text from the speech recognizer, said text cor- medium holding computer-executable instructions for per- 

responding to speech input; and forming a method comprising the computer-implemented 

normalizing at least a portion of said text to replace the 60 ste P s 

portion of said text with a normalized alphanumeric receiving text corresponding to speech input from the 

string, said normalizing comprising applying a rule speech recognition engine; 

from the context-free grammar to replace the portion of applying a context-free grammar to identify substitute 

said text being normalized with the normalized alpha- content for the received text; and 

numeric string. 65 substituting the text with the substitute content. 

16. The method of claim 15, further comprising the steps 32. The computer-readable medium of claim 31 wherein 
of replacing the file with a substitute file that sets forth rules the substitute content comprises an alphanumeric string. 
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33. The computer-readable medium of claim 31 wherein 
the substitute content comprises graphical content. 

34. The computer-readable medium of claim 31 wherein 
the received text is an identifier of media content in a 
distributed system and the substitute content is the media 
content. 

35. The computer-readable medium of claim 34 wherein 
the received text is a uniform resource locator (URL). 

36. The computer-readable medium of claim 35 wherein 
the substitute content is a hypertext document. 

37. The computer-readable medium of claim 31 wherein 
the substitute content is a hypertext document. 

38. The computer-readable medium of claim 31 wherein 
the substitute content comprises audio content. 

39. The computer-readable medium of claim 31 wherein 
the received text is a string of words and the substitute 
content contains a series of digits corresponding to at least 
some of the string of words. 

40. The computer-readable medium of claim 31 wherein 
the received text is a string of words specifying an address 
and the substitute content includes a series of digits speci- 
fying at least a portion of the address. 

41. The computer-readable medium of claim 31 wherein 
the received text is a string of words identifying an amount 
of currency and the substitute content includes digits and a 
currency symbol that specifies the amount of currency. 

42. The computer-readable medium of claim 31 wherein 
the received text is a string of words that specifies a fraction and the 
substitute content includes digits and a mathematical opera- 
tion that in conjunction specify the fraction. 
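
As an informal illustration of the substitution recited in claims 31 and 39, the following Python sketch applies a small context-free grammar to replace spoken number words with digits. The grammar, the word tables, and the function names are hypothetical and chosen only for this example; they are not the rules or the tree structure of the described embodiment, and the sketch handles only lowercase numbers below one thousand.

```python
# Hypothetical toy grammar (an assumption for illustration only):
#   Number   -> Hundreds Below100 | Hundreds | Below100
#   Hundreds -> Digit "hundred"
#   Below100 -> Tens Digit | Tens | Teen | Digit
DIGITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
          "six": 6, "seven": 7, "eight": 8, "nine": 9}
TEENS = {"ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13,
         "fourteen": 14, "fifteen": 15, "sixteen": 16,
         "seventeen": 17, "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}


def parse_below100(words, i):
    """Try Below100 at position i; return (value, next_index) or None."""
    if i >= len(words):
        return None
    word = words[i]
    if word in TENS:
        # Below100 -> Tens Digit | Tens
        if i + 1 < len(words) and words[i + 1] in DIGITS:
            return TENS[word] + DIGITS[words[i + 1]], i + 2
        return TENS[word], i + 1
    if word in TEENS:
        return TEENS[word], i + 1
    if word in DIGITS:
        return DIGITS[word], i + 1
    return None


def parse_number(words, i):
    """Try Number at position i; return (value, next_index) or None."""
    # Hundreds -> Digit "hundred"
    if i + 1 < len(words) and words[i] in DIGITS and words[i + 1] == "hundred":
        value, j = DIGITS[words[i]] * 100, i + 2
        rest = parse_below100(words, j)
        if rest is not None:
            return value + rest[0], rest[1]
        return value, j
    return parse_below100(words, i)


def normalize(text):
    """Replace every span that parses as Number with its digit string."""
    words = text.split()
    out, i = [], 0
    while i < len(words):
        match = parse_number(words, i)
        if match is not None:
            out.append(str(match[0]))
            i = match[1]
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)


print(normalize("one hundred forty seven"))  # 147
```

Words that no rule matches pass through unchanged, so only part of the received text is substituted.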

43. In a computer system having a speech recognizer for 
recognizing portions of speech in speech input, a computer- 
readable medium holding computer-executable instructions 
for performing a method comprising the computer- 
implemented steps of: 

providing a file that sets forth rules of a context-free 
grammar for normalizing text; 

receiving text from the speech recognizer, said text cor- 
responding to speech input; and 

normalizing at least a portion of said text to replace the 
portion of said text with a normalized alphanumeric 
string, said normalizing comprising applying a rule 
from the context-free grammar to replace the portion of 
said text being normalized with the normalized 
alphanumeric string. 

44. The computer-readable medium of claim 43 wherein 
the method further comprises the steps of replacing the file 
with a substitute file that sets forth rules of a different 
context-free grammar and using the different context-free 
grammar to normalize new text. 

45. The computer-readable medium of claim 43 wherein 
the file is a text file. 

46. The computer-readable medium of claim 43 wherein 
the file contains specification of a switch that identifies 
whether or not a rule is to be used as part of the context-free 
grammar. 

47. The computer-readable medium of claim 43 wherein 
the method further comprises the step of altering contents of 
the file to change the context-free grammar. 
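
The file-based arrangement of claims 43 through 47 can be pictured with the following Python sketch. The rule-file syntax shown here (a leading `+` or `-` switch, then a spoken phrase, `=>`, and a replacement) is an assumption invented for this example, and the rules are flat, terminal-only productions rather than a full grammar. The sketch is meant only to show how a text file with per-rule switches could drive normalization.

```python
# Hypothetical rule-file contents; the "+/-" switch and "=>" syntax are
# assumptions for illustration, not the format of the described embodiment.
RULES_TEXT = """
# switch  spoken phrase => replacement
+ percent => %
+ dollar sign => $
- smiley face => :-)
"""


def load_rules(text):
    """Parse rule lines, keeping only rules whose switch is '+'."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        switch, body = line[0], line[1:]
        if switch != "+":
            continue  # rule is switched off
        spoken, replacement = (part.strip() for part in body.split("=>", 1))
        rules.append((spoken.split(), replacement))
    return rules


def normalize(text, rules):
    """Replace each span of words matching an enabled rule."""
    words = text.split()
    out, i = [], 0
    while i < len(words):
        for spoken, replacement in rules:
            if words[i:i + len(spoken)] == spoken:
                out.append(replacement)
                i += len(spoken)
                break
        else:
            # no rule matched at this position; keep the word as-is
            out.append(words[i])
            i += 1
    return " ".join(out)


rules = load_rules(RULES_TEXT)
print(normalize("ten percent smiley face", rules))  # ten % smiley face
```

Editing the text, or loading a different text in its place, changes the rule set without changing the code, which is the kind of extensibility these claims describe.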

48. In a computer system having an application program 
and a speech recognizer for recognizing portions of speech in 
speech input and outputting text corresponding to the rec- 
ognized portions of speech, a computer-readable medium 
holding computer-executable instructions for performing a 
method comprising the computer-implemented steps of: 

providing an application program interface (API) that 
includes a text normalizer; 

receiving text from the speech recognizer at the text 
normalizer; 

normalizing the text by applying a rule from a 
context-free grammar to alter contents of the text and 
produce normalized text; and 

passing the normalized text to the application program. 

49. The computer-readable medium of claim 48 wherein 
the API is a speech API that provides textual output 
corresponding to recognized speech input to the application 
program. 

50. The computer-readable medium of claim 48 wherein 
the application program requests text from the API to 
prompt the passing of the normalized text to the application 
program. 
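
The data flow of claims 48 through 50 (recognizer, then text normalizer inside an API, then application program) can be sketched in Python as follows. The class name `SpeechAPI`, the method `get_text`, and the stand-in recognizer and normalizer are all hypothetical names introduced for this example; they do not correspond to any actual speech API.

```python
from typing import Callable


class SpeechAPI:
    """Hypothetical API layer that holds a text normalizer."""

    def __init__(self,
                 recognizer: Callable[[], str],
                 normalizer: Callable[[str], str]) -> None:
        self._recognizer = recognizer  # returns raw recognized text
        self._normalizer = normalizer  # applies grammar rules to the text

    def get_text(self) -> str:
        """Called by the application; the request prompts the passing
        of normalized text back to the caller."""
        raw_text = self._recognizer()
        return self._normalizer(raw_text)


# Stand-ins for a real recognizer and a grammar-driven normalizer.
def fake_recognizer() -> str:
    return "five percent"


def fake_normalizer(text: str) -> str:
    return text.replace(" percent", "%")


api = SpeechAPI(fake_recognizer, fake_normalizer)
print(api.get_text())  # five%
```

The application sees only normalized text, so the normalizer can be changed without modifying the application program.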
It is certified that error appears in the above-identified patent and that said Letters Patent is hereby 
corrected as shown below: 

Column 1, line 48, replace "receive" with 
--received--. 

Column 8, line 18, replace "process" with 
--processed--. 

Column 8, line 27, replace "process" with 
--processed--. 

Column 11, line 28, after "string" insert 
--of words--. 

Column 12, line 28, replace "normalizers" 
with --normalizer--. 